These functions allow the user to match a reference set of sky coordinates against a comparison set of sky coordinates. The match radius can be varied per source (all matches within this radius are returned for each source), and mutual best matches are also extracted. coordmatch should be used for finding multiple matches and coordmatchsing should be used when trying to find matches around a single source. internalclean is a utility function that removes close duplicate objects via some tiebreak criterion, and is probably only of interest to advanced users trying to clean catalogues that were produced from overlapping frames.
coordmatch(coordref, coordcompare, rad = 2, inunitref = "deg", inunitcompare = "deg",
radunit = "asec", sep = ":", kstart = 10, ignoreexact = FALSE, ignoreinternal=FALSE,
matchextra = FALSE, smallapprox=FALSE)
coordmatchsing(RAref,Decref, coordcompare, rad=2, inunitref = "deg",
inunitcompare="deg", radunit='asec', sep = ":", ignoreexact=FALSE, smallapprox=FALSE)
internalclean(RA, Dec, rad=2, tiebreak, decreasing = FALSE, inunit="deg", radunit='asec',
sep = ":")
For coordmatch this is the reference dataset, i.e. you want to find matches for each object in this catalogue. A minimum two column matrix or data.frame, where column one is the RA and column two the Dec. See matchextra.
The comparison dataset, i.e. you want to find objects in this catalogue that match locations in coordref. A minimum two column matrix or data.frame, where column one is the RA and column two the Dec. If coordcompare is not provided then it is set to coordref automatically. Since this means the user is doing a single table internal match ignoreinternal is automatically set to TRUE (but this can be overridden). See matchextra.
For coordmatchsing this is the reference RA for the single object of interest.
For coordmatchsing this is the reference Dec for the single object of interest.
For internalclean this is a vector of right ascensions for internal cleaning. If RA is a two column structure then the second column is taken to be Dec.
For internalclean this is a vector of declinations for internal cleaning. If RA is a two column structure then the second column is taken to be Dec.
The matching radius to use. If this is length one then the same radius is used for all objects, otherwise it must be the same length as the number of rows in coordref.
For internalclean this is a vector of values used to determine the preferred source, e.g. something like magnitude or distance to the centre of the origin frame. By default smaller values are considered better, but this can be flipped by setting decreasing=TRUE. If tiebreak is not provided then the first source that appears is considered the better object in the cleaned catalogue.
Determines whether smaller (decreasing=FALSE) or larger (decreasing=TRUE) tiebreak values are considered preferable.
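The roles of tiebreak and decreasing can be sketched with a naive base R routine. This is only an illustration under stated assumptions, not how internalclean is implemented: naiveclean is a hypothetical name, it uses a simple O(N^2) greedy pass, and it assumes positions in degrees and the radius in arc seconds.

```r
naiveclean <- function(RA, Dec, rad = 2, tiebreak = seq_along(RA), decreasing = FALSE) {
  # Visit sources from most to least preferred
  ord <- order(tiebreak, decreasing = decreasing)
  ra <- RA * pi / 180
  dec <- Dec * pi / 180
  keep <- integer(0)
  for (i in ord) {
    if (length(keep) > 0) {
      # Haversine separation (arcsec) to the sources already kept
      h <- sin((dec[keep] - dec[i]) / 2)^2 +
        cos(dec[keep]) * cos(dec[i]) * sin((ra[keep] - ra[i]) / 2)^2
      sep <- 2 * asin(pmin(1, sqrt(h))) * 180 / pi * 3600
      if (any(sep <= rad)) next   # a preferred duplicate is already kept
    }
    keep <- c(keep, i)
  }
  sort(keep)   # row indices of the surviving sources
}

# Two sources 1 arcsec apart plus one isolated source
RA <- c(10, 10 + 1 / 3600, 11)
Dec <- c(0, 0, 0)
mag <- c(19.5, 18.2, 20.1)
naiveclean(RA, Dec, rad = 2, tiebreak = mag)                     # keeps rows 2 and 3
naiveclean(RA, Dec, rad = 2, tiebreak = mag, decreasing = TRUE)  # keeps rows 1 and 3
```

With the default tiebreak in this sketch (row order), the first source of each close pair survives, mirroring the behaviour described above for a missing tiebreak.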
The units of angular coordinate provided for coordref / RAref / Decref. Allowed options are deg for degrees, rad for radians and sex for sexagesimal (i.e. HMS for RA and DMS for Dec).
The units of angular coordinate provided for coordcompare. Allowed options are deg for degrees, rad for radians and sex for sexagesimal (i.e. HMS for RA and DMS for Dec).
The units of angular coordinate provided for RA and Dec in internalclean. Allowed options are deg for degrees, rad for radians and sex for sexagesimal (i.e. HMS for RA and DMS for Dec).
The unit type for the radius specified. Allowed options are deg for degrees, amin for arc minutes, asec for arc seconds and rad for radians.
The number of matching nodes to attempt initially. The code iterates until all matches within the specified radius (rad) have been found, but it works faster if kstart is close to the maximum number of matches for any coordref object.
Should exact matches be ignored in the output? If TRUE then the IDs of matches with zero separation are set to 0 and their separations are set to NA. This might be helpful when matching the same table against itself, where you have no interest in finding object matches with respect to themselves.
Should identical row matches be ignored in the output? If TRUE then exact row ID matches are set to 0 and the separation is NA. The bestmatch output will also ignore these trivial matches. This only makes sense if coordref and coordcompare are the same table and you are trying to do an internal table match where you do not want the trivial result of rows matching to themselves. Automatically switches to TRUE if coordcompare is not provided.
Should extra columns in coordref and coordcompare be used as part of the N-D match? Extra columns beyond the required RA and Dec can be provided and these will be used as part of the N-D match. The meaning of rad in this case is not trivial, since the match is done within a hyper-sphere. When the extra columns have the same value, rad can still be interpreted as an angular coordinate match. These extra columns should be appropriately scaled, e.g. you might want to make a 2 arcsec match with an extra magnitude column. In this case even if two objects sit on top of each other on sky, they cannot differ by more than 2 mag in flux to be a match.
Should the small angle approximation of asin(a/b) = a/b be used? If TRUE then some computations may be much faster, since asin is an expensive computation to make for lots of near matches.
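As a rough illustration of why the approximation is usually safe at typical match radii, the base R sketch below compares asin(x) with x for arguments corresponding to a few angular scales. The variable names are illustrative only and are not part of the package.

```r
# Angular scales in arc seconds, converted to radians
asec <- c(1, 60, 3600)            # 1 arcsec, 1 arcmin, 1 degree
x <- asec * pi / (180 * 3600)

# Relative error made by replacing asin(x) with x
relerr <- (asin(x) - x) / asin(x)
print(data.frame(asec = asec, relerr = relerr))
```

Since asin(x) is approximately x + x^3/6 for small x, the relative error grows roughly as x^2/6 and stays below 1e-4 even at one degree.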
The output of coordmatch is a list containing:
The full matrix of matching IDs. The rows are ordered identically to coordref, and the ID value is the row position in coordcompare for the match.
The full matrix of matching separations in the same units as radunit. The rows are ordered identically to coordref, and the sep value is the separation for each matrix location in the ID list object.
Nmatch is a vector giving the total number of matches for each coordref row.
A three column data.frame giving the best matching IDs. Only objects with at least one match are listed. Column 1 (refID) gives the row position from coordref and column 2 (compareID) gives the corresponding best matching row position in coordcompare. Column 3 (sep) gives the separation between the matched ref and compare positions in the same units as radunit.
The output of coordmatchsing is a list containing:
The full vector of matching IDs. The ID values are the row positions in coordcompare for the match.
The full vector of matching separations in the same units as radunit. The sep value is the separation for each vector location in the ID list object.
Total number of matches within the specified radius.
The best matching ID, where the ID value is the row position in coordcompare for the match.
For coordmatch the main matching is done using nn2 that comes as part of the RANN package. coordmatch adds a large amount of sky coordinate oriented functionality beyond the simple implementation of nn2. For single object matches coordmatchsing should be used since it is substantially faster in this regime (making use of direct dot products).
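The dot-product idea mentioned above can be sketched in base R as follows. This is a minimal illustration and not the package's actual implementation: sph2unit and sepdot are hypothetical helper names, and the inputs are assumed to be in degrees.

```r
# Convert RA/Dec in degrees to unit vectors on the sphere (one row per object)
sph2unit <- function(RA, Dec) {
  ra <- RA * pi / 180
  dec <- Dec * pi / 180
  cbind(cos(dec) * cos(ra), cos(dec) * sin(ra), sin(dec))
}

# Separation (arcsec) between one reference position and every comparison row
sepdot <- function(RAref, Decref, coordcompare) {
  ref <- sph2unit(RAref, Decref)                           # 1 x 3
  comp <- sph2unit(coordcompare[, 1], coordcompare[, 2])   # n x 3
  dots <- as.vector(comp %*% t(ref))                       # cos(separation)
  dots <- pmin(pmax(dots, -1), 1)                          # guard against rounding
  acos(dots) * 180 / pi * 3600
}

set.seed(1)
mocksky <- cbind(runif(1e3), runif(1e3))
seps <- sepdot(0.5, 0.5, mocksky)
which(seps < 120)   # rows within 2 arcmin of (0.5, 0.5)
```

A single matrix product yields the cosine of the separation to every comparison object at once, which is why this route is cheap for one reference source. Note that acos loses precision for very small separations, so this sketch should not be relied on for sub-arcsecond accuracy.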
ignoreexact is the stricter option, since all exactly coincident objects are ignored, whereas with ignoreinternal only identical row IDs are interpreted as being the same object.
# NOT RUN {
set.seed(666)
#Here we make objects in a virtual 1 square degree region
mocksky=cbind(runif(1e3), runif(1e3))
#Now we match to find all objects within an arc minute, ignoring self matches
mockmatches=coordmatch(mocksky, mocksky, ignoreexact=TRUE, rad=1, radunit='amin')
#Now we match to find all objects with varying match radii, ignoring self matches
mockmatchesvary=coordmatch(mocksky, mocksky, ignoreexact=TRUE, rad=seq(0,1,length=1e3),
radunit='amin')
#We can do this also by using the internal table match mode:
mockmatchesvary2=coordmatch(mocksky, rad=seq(0,1,length=1e3), radunit='amin')
#Check that this looks the same (should be identical with all zeroes):
summary(mockmatchesvary$bestmatch-mockmatchesvary2$bestmatch)
# }